Amendments to the Claims

1. (Currently Amended) A system, comprising:

a psycho-physical state detection mechanism for detecting a psycho-physical state
of a user based on the input speech data from the user; and

a spoken dialogue mechanism for carrying on a dialogue with the user based
on the psycho-physical state of the user, detected by the psycho-physical detection
mechanism from the input speech data from the user.

2. (Currently Amended) The system according to claim 1, wherein the
spoken dialogue mechanism comprises:

a speech understanding mechanism for understanding the input speech data from 
the user based on the psycho-physical state of the user to generate a literal meaning of the 
input speech data; and 

a voice response generation mechanism for generating a voice response to the
user based on the literal meaning of the input speech data and the psycho-physical state
of the user, wherein the voice response to the user is linguistically and acoustically
adjusted according to the detected psycho-physical state of the user.

3. (Currently Amended) The system according to claim 2, wherein the
speech understanding mechanism comprises:

at least one acoustic model for characterizing the acoustic properties of the input 
speech data, each of the at least one acoustic model corresponding to some distinct
characteristic related to a psycho-physical state of a speaker; 

an acoustic model selection mechanism for selecting an acoustic model that is
appropriate according to the psycho-physical state detected by the psycho-physical
state detection mechanism;

a speech recognizer for generating a transcription of spoken words recognized
from the input speech data using the acoustic model selected by the acoustic
model selection mechanism; and

a language understanding mechanism for interpreting the literal meaning of the 
input speech data based on the transcription. 
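
As a reading aid only, the arrangement recited in claim 3 might be sketched as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the names `SpeechUnderstanding`, `select_model`, `understand`, the "neutral" fallback state, and the toy action/object parse are all hypothetical, and each acoustic model is stood in for by a plain function that returns recognized words.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical stand-in: an "acoustic model" paired with a recognizer is
# reduced to a function from raw speech data to a list of recognized words.
AcousticModel = Callable[[bytes], List[str]]


@dataclass
class SpeechUnderstanding:
    # One acoustic model per psycho-physical state (assumed keyed by state name).
    acoustic_models: Dict[str, AcousticModel]
    fallback_state: str = "neutral"  # assumption: a default model always exists

    def select_model(self, state: str) -> AcousticModel:
        """Acoustic model selection: pick the model matching the detected state."""
        return self.acoustic_models.get(
            state, self.acoustic_models[self.fallback_state]
        )

    def understand(self, speech: bytes, state: str) -> Dict[str, object]:
        """Recognize words with the selected model, then interpret them."""
        words = self.select_model(state)(speech)
        transcription = " ".join(words)
        # Toy language understanding: first word is the action, the rest the object.
        meaning = {"action": words[0] if words else None,
                   "object": " ".join(words[1:])}
        return {"transcription": transcription, "meaning": meaning}
```

In this sketch the detected state only influences which model is used for recognition; the interpretation step works from the transcription alone, mirroring the order of the elements in the claim.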

4. (Currently Amended) The system according to claim 2, wherein the voice
response generation mechanism comprises:

a natural language response generator for generating a response based on an
understanding of the transcription, the response being generated appropriately
according to the psycho-physical state of the user;

a prosodic pattern determining mechanism for determining a prosodic pattern
to be applied to the response that is considered appropriate according to the
psycho-physical state; and

a text-to-speech engine for synthesizing the voice response based on the
response and the prosodic pattern.

5. (Currently Amended) The system according to claim 1, wherein the
psycho-physical state detection mechanism comprises:

an acoustic feature extractor for extracting acoustic features from the input speech
data to generate at least one acoustic feature; and

a psycho-physical state classifier for classifying the input speech data into one or
more psycho-physical states based on the at least one acoustic feature.

6. (Original) The system according to claim 5, further comprising:

at least one psycho-physical state model, each of the at least one
psycho-physical state model corresponding to a single psycho-physical state and
characterizing the acoustic properties of the single psycho-physical state; and

an off-line training mechanism for establishing the at least one
psycho-physical model based on labeled training speech data.
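
For illustration, the off-line training and classification elements of claims 5 and 6 could be sketched as below. This is only a sketch under loud assumptions: the claims do not name any modeling technique, so a nearest-centroid scheme is swapped in purely as an example, the function names `train_state_models` and `classify` are hypothetical, acoustic features are assumed to be fixed-length numeric vectors, and the classifier here returns a single state although the claim allows one or more.

```python
from collections import defaultdict
from math import dist
from typing import Dict, List, Sequence, Tuple

Features = Sequence[float]  # assumed: fixed-length acoustic feature vector


def train_state_models(
    labeled: List[Tuple[Features, str]]
) -> Dict[str, List[float]]:
    """Off-line training: build one model per psycho-physical state.

    Each model is simply the mean feature vector (centroid) of the training
    items labeled with that state. This is an illustrative choice only.
    """
    grouped: Dict[str, List[Features]] = defaultdict(list)
    for features, state in labeled:
        grouped[state].append(features)
    return {
        state: [sum(column) / len(column) for column in zip(*items)]
        for state, items in grouped.items()
    }


def classify(features: Features, models: Dict[str, List[float]]) -> str:
    """Classification: return the state whose model is closest to the features."""
    return min(models, key=lambda state: dist(features, models[state]))
```

A call such as `classify(features, train_state_models(labeled))` would then assign new speech features to the state with the nearest centroid; any real system would likely use a richer statistical model per state.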

7. (Currently Amended) The system according to claim 1, further comprising a
dialogue manager to control the dialogue flow.

8. (Cancelled) 

9. (Cancelled) 

10. (Currently Amended) A method, comprising: 

receiving, by a psycho-physical state detection mechanism, input speech data
from a user;

detecting a psycho-physical state of the user from the input speech data;

understanding, by a speech understanding mechanism, a literal meaning of
spoken words recognized from the input speech data based on the psycho-physical state
of the user, detected by the detecting; and

generating, by a voice response generation mechanism, a voice response to the
user based on the literal meaning of the input speech data and the psycho-physical state
of the user, wherein the voice response to the user is linguistically and acoustically
adjusted according to the detected psycho-physical state of the user.

11. (Currently Amended) The method according to claim 10, wherein the
detecting comprises:

extracting, by an acoustic feature extractor, at least one acoustic feature from the 
input speech data; and 

classifying, by a psycho-physical state classifier and based on said at least one
acoustic feature, the input speech data into the psycho-physical state according to at least
one psycho-physical state model.

12. (Currently Amended) The method according to claim 11, further comprising:

receiving, by an off-line training mechanism, labeled training data, wherein each
of the data items in the labeled training data is labeled by a psycho-physical state;
and

building the at least one psycho-physical state model using the labeled
training data, each of the at least one psycho-physical state model corresponding to a
single psycho-physical state and being established based on the data items in the labeled
training data that have a label corresponding to the single psycho-physical state.

13. (Currently Amended) The method according to claim 10, wherein the
understanding comprises:

selecting, by an acoustic model selection mechanism, an acoustic model, from at
least one acoustic model, that is appropriate according to the psycho-physical state,
detected by the detecting, each of the at least one acoustic model corresponding
to some distinct speech characteristic related to the psycho-physical state;

recognizing, by a speech recognizer, the spoken words from the input speech data
using the acoustic model selected by the selecting, to generate a transcription; and

interpreting, by a language understanding mechanism, the literal meaning of the 
spoken words based on the transcription. 

14. (Currently Amended) The method according to claim 10, wherein the
generating comprises:

constructing, by a natural language response generator, a natural language
response based on an understanding of the transcription, the natural language
response being constructed appropriately according to the psycho-physical state of the
user;

determining, by a prosodic pattern determining mechanism, a prosodic pattern
to be applied to said natural language response, wherein the prosodic pattern is
considered to be appropriate according to the psycho-physical state; and

synthesizing, by a text-to-speech engine, the voice response based on the
natural language response and the prosodic pattern.
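
The three steps of claim 14 (constructing a response, determining a prosodic pattern, synthesizing) might be sketched as follows. Everything here is an assumption made for illustration: the `Prosody` fields, the state names, the wording rule for a "stressed" user, and the `synthesize` placeholder, which only returns a request description rather than calling any real text-to-speech engine.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Prosody:
    rate: float         # hypothetical: relative speaking rate (1.0 = normal)
    pitch_shift: float  # hypothetical: pitch offset in semitones


# Assumed mapping from detected state to a prosodic pattern.
PROSODY_BY_STATE: Dict[str, Prosody] = {
    "neutral": Prosody(rate=1.0, pitch_shift=0.0),
    "stressed": Prosody(rate=0.85, pitch_shift=-2.0),
}


def construct_response(content: str, state: str) -> str:
    """Linguistic adjustment: choose wording according to the user's state."""
    if state == "stressed":
        return f"No problem. {content}."
    return f"{content}."


def determine_prosody(state: str) -> Prosody:
    """Acoustic adjustment: pick a prosodic pattern for the user's state."""
    return PROSODY_BY_STATE.get(state, PROSODY_BY_STATE["neutral"])


def synthesize(text: str, prosody: Prosody) -> Dict[str, object]:
    """Placeholder for a text-to-speech engine: describes the request only."""
    return {"text": text, "rate": prosody.rate, "pitch_shift": prosody.pitch_shift}


def respond(content: str, state: str) -> Dict[str, object]:
    """Run the three steps in the order recited in the claim."""
    return synthesize(construct_response(content, state), determine_prosody(state))
```

Keeping the wording choice and the prosody choice in separate functions reflects the claim's distinction between linguistic and acoustic adjustment of the voice response.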

15. (Cancelled) 

16. (Cancelled) 

17. (Cancelled) 

18. (Currently Amended) A computer-readable medium encoded with a program,
said program comprising instructions that when executed by a computer cause the
computer to:

receive, by a psycho-physical state detection mechanism, input speech
data from a user;

detecting a psycho-physical state of the user from the input speech data;

understanding, by a speech understanding mechanism, a literal meaning of
spoken words recognized from the input speech data based on the psycho-physical state
of the user; and

generate, by a voice response generation mechanism, a
voice response to the user based on the literal meaning of the input speech data and the
psycho-physical state of the user, wherein the voice response to the user is linguistically
and acoustically adjusted according to the detected psycho-physical state of the user.

19. (Currently Amended) The medium according to claim 18, wherein the
detecting comprises instructions that when executed by the computer cause the computer
to:

extracting, by an acoustic feature extractor, at least one acoustic feature from the
input speech data; and

classifying, by a psycho-physical state classifier and based on the at least one
feature, the input speech data into the psycho-physical state according to at least one
psycho-physical state model.

20. (Currently Amended) The medium according to claim 19, further comprising
instructions that when executed by the computer cause the computer to:

receive, by an off-line training mechanism, labeled training data,
wherein each of the data items in the labeled training data is labeled by a
psycho-physical state; and

building the at least one psycho-physical state model using the labeled
training data, each of the at least one psycho-physical state model corresponding to a
single psycho-physical state and being established based on the data items in the labeled
training data that have a label corresponding to the single psycho-physical state.

21. (Currently Amended) The medium according to claim 18, wherein the
understanding comprises instructions that when executed by the computer cause the
computer to:

selecting, by an acoustic model selection mechanism, an acoustic model, from at
least one acoustic model, that is appropriate according to the psycho-physical state,
each of the at least one acoustic model corresponding to
some distinct speech characteristic related to a psycho-physical state;

recognize, by a speech recognizer, the spoken words from the input
speech data using the acoustic model to generate a
transcription; and

interpret, by a language understanding mechanism, the literal
meaning of the spoken words based on the transcription.

22. (Currently Amended) The medium according to claim 18, wherein the to
generate comprises instructions that when executed by the computer cause the
computer to:

constructing, by a natural language response generator, a natural language
response based on an understanding of the transcription, the natural language
response being constructed appropriately according to the psycho-physical state of the
user;

determine, by a prosodic pattern determining mechanism, a
prosodic pattern to be applied to the natural language response, wherein
the prosodic pattern is considered to be appropriate according to the psycho-physical
state; and

synthesize, by a text-to-speech engine, the voice response based on
the natural language response and the prosodic pattern.